Abstract. We present a link rewiring mechanism to produce surrogates of a network 
where both the degree distribution and the rich-club connectivity are preserved. We 
consider three real networks, the AS-Internet, the protein interaction and the scientific 
collaboration. We show that for a given degree distribution, the rich-club connectivity 
is sensitive to the degree-degree correlation, and on the other hand the degree-degree 
correlation is constrained by the rich-club connectivity. In particular, in the case of 
the Internet, the assortative coefficient is always negative and a minor change in its 
value can reverse the network's rich-club structure completely; while fixing the degree 
distribution and the rich-club connectivity restricts the assortative coefficient to such a 
narrow range, that a reasonable model of the Internet can be produced by considering 
mainly the degree distribution and the rich-club connectivity. We also comment on 
the suitability of using the maximal random network as a null model to assess the 
rich-club connectivity in real networks. 



PACS numbers: 89.75.-k, 89.75.Da, 89.75.Fb, 89.20.Hh, 82.39.Rt, 87.23.Ge, 05.70.Ln 
1. Introduction 

In graph theory the degree k is defined as the number of links a node has. The 
distribution of degree P(k) provides a global view of a network's structure and is one of 
the most studied topological properties. Many complex networks are scale-free because 
they exhibit a power-law degree distribution, i.e. P(k) ~ k^{-γ}, γ > 1. 
A more complete description of a network's structure is obtained from the joint degree 
distribution P(k, k'), which is the probability that a randomly selected link 
connects a node of degree k with a node of degree k'. The degree distribution can be 
obtained from the joint degree distribution: P(k) = (⟨k⟩/k) Σ_{k'} P(k, k'), where ⟨k⟩ is the 
average degree. 

The joint degree distribution characterises the degree-degree correlation [12], [13] 
between two nodes connected by a link. But in practice P(k, k') can be difficult 
to measure, in particular for a finite-size and scale-free network [14]. Nevertheless 
the degree-degree correlation can be inferred from the average degree of the nearest 
neighbours of k-degree nodes [15], [16], which is a projection of the joint degree 
distribution given by 

\[ k_{nn}(k) = \frac{\langle k \rangle}{k P(k)} \sum_{k'} k' P(k, k'). \tag{1} \]

If the nearest-neighbours average degree k_nn(k) is an increasing function of k then the 
network is assortative, where nodes tend to attach to alike nodes, i.e. high-degree 
nodes to high-degree nodes and low-degree nodes to low-degree nodes. If k_nn(k) is a 
decreasing function of k then the network is disassortative, where high-degree nodes tend 
to connect with low-degree nodes. A network's degree-degree correlation, or mixing 
pattern, can also be summarised by a single scalar called the assortativity coefficient α, 
-1 ≤ α ≤ 1 [12], 

\[ \alpha = \frac{L^{-1} \sum_{j>i} k_i k_j a_{ij} - \left[ L^{-1} \sum_{j>i} \frac{1}{2} (k_i + k_j) a_{ij} \right]^2}{L^{-1} \sum_{j>i} \frac{1}{2} (k_i^2 + k_j^2) a_{ij} - \left[ L^{-1} \sum_{j>i} \frac{1}{2} (k_i + k_j) a_{ij} \right]^2}, \tag{2} \]

where L is the total number of links, k_i and k_j are the degrees of nodes i and j, and a_ij is an 
element of the network's adjacency matrix, where a_ij = 1 if nodes i and j are connected 
by a link and a_ij = 0 otherwise [17]. For an uncorrelated network α = 0, for an assortative 
network α > 0 and for a disassortative network α < 0. 
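
As an illustration, the assortativity coefficient can be evaluated directly from an edge list. The following is a minimal Python sketch, assuming an undirected simple graph and the standard link-based form of Newman's coefficient; the function name `assortativity` and the two toy graphs are hypothetical and are not taken from the data sets studied here.

```python
from collections import Counter
from typing import Iterable, Tuple


def assortativity(edges: Iterable[Tuple[int, int]]) -> float:
    """Degree assortativity of an undirected simple graph given as an edge list."""
    edge_list = list(edges)
    degree = Counter()
    for i, j in edge_list:
        degree[i] += 1
        degree[j] += 1

    n_links = len(edge_list)
    # Averages over links of k_i*k_j, (k_i + k_j)/2 and (k_i^2 + k_j^2)/2.
    mean_prod = sum(degree[i] * degree[j] for i, j in edge_list) / n_links
    mean_deg = sum(0.5 * (degree[i] + degree[j]) for i, j in edge_list) / n_links
    mean_sq = sum(0.5 * (degree[i] ** 2 + degree[j] ** 2) for i, j in edge_list) / n_links

    variance = mean_sq - mean_deg ** 2
    if variance == 0:
        # Happens when every link end has the same degree (e.g. a ring).
        raise ValueError("assortativity is undefined for this graph")
    return (mean_prod - mean_deg ** 2) / variance


# Hypothetical toy graphs used only to exercise the function.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]  # one hub connected to four leaves
path = [(0, 1), (1, 2), (2, 3)]          # chain of four nodes
print(assortativity(star), assortativity(path))
```

The star graph, in which every link joins the hub to a leaf, reaches the lower bound α = -1, while the four-node chain gives a value of about -0.5.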

In some scale-free networks the best connected nodes, rich nodes, tend to be very 
well connected between themselves. A rich-club is the set of nodes R_{>k} with degrees 
larger than a given degree k. The connectivity between members of the rich-club is 
measured by the rich-club connectivity [18], which is defined as the ratio of the number 
of links E_{>k} shared by the nodes in the set R_{>k} to the maximum possible number of 
links that the rich nodes can share, 

\[ \phi(k) = \frac{E_{>k}}{|R_{>k}| \, (|R_{>k}| - 1)/2} = \frac{2 E_{>k}}{|R_{>k}| \, (|R_{>k}| - 1)}, \tag{3} \]

where |R_{>k}| is the number of nodes in the set R_{>k} [19]. The rich-club connectivity 
as a function of the degree is a global property of a network. The rich-club connectivity 
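is a different projection of the joint degree distribution 

\[ \phi(k) = \frac{N \langle k \rangle \sum_{k'=k+1}^{k_{max}} \sum_{k''=k+1}^{k_{max}} P(k', k'')}{\left[ N \sum_{k'=k+1}^{k_{max}} P(k') \right] \left[ N \sum_{k'=k+1}^{k_{max}} P(k') - 1 \right]}, \tag{4} \]

where N is the total number of nodes and k_max is the maximum degree in the network. 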
The rich-club connectivity and the degree-degree correlation are not trivially related. 
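
The definition of the rich-club connectivity as a ratio of shared links to the maximum possible number of links translates directly into code. The following Python sketch is an illustration only, assuming an undirected simple graph stored as an edge list; the function name `rich_club_connectivity` and the toy graph are hypothetical. It scans every threshold degree, which is simple rather than efficient.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def rich_club_connectivity(edges: Iterable[Tuple[int, int]]) -> Dict[int, float]:
    """Return phi(k) for every degree k at which the rich-club has at least two members."""
    edge_list = list(edges)
    degree = Counter()
    for i, j in edge_list:
        degree[i] += 1
        degree[j] += 1

    phi = {}
    for k in range(max(degree.values())):
        # Rich-club: nodes with degree strictly larger than k.
        rich = {node for node, d in degree.items() if d > k}
        if len(rich) < 2:
            continue
        # Links whose two end nodes both belong to the rich-club.
        shared = sum(1 for i, j in edge_list if i in rich and j in rich)
        phi[k] = 2.0 * shared / (len(rich) * (len(rich) - 1))
    return phi


# Hypothetical toy graph: a triangle of hubs, each carrying one pendant node.
toy = [(0, 1), (1, 2), (0, 2), (0, 3), (1, 4), (2, 5)]
print(rich_club_connectivity(toy))
```

In the toy graph the three hubs form a fully connected rich-club, so φ(k) = 1 for k = 1 and k = 2, whereas at k = 0 all six nodes are included and only 6 of the 15 possible links exist.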

Our motivation here is twofold. First, to study whether the description of a network using 
P(k) and φ(k) produces a reasonable model of a real network. We consider three real 
networks, the AS-Internet, the protein interaction and the scientific collaboration. Our 
approach is, from a real network, to create surrogate networks with the same P(k), 
or even the same φ(k), as the original network, and then compare properties of the 
surrogates with the original network. Second, we are interested in the properties of the 
surrogates, in particular the maximal random case of a network, as it has been used as 
a 'null model' to assess network properties. 



Structural constraints in complex networks 







[Figure 1 panels: (a) assortative wiring; (b) disassortative wiring; (c) neutral wiring. Arrows between panels are labelled 'Preserving P(k)' and 'Preserving P(k) and φ(k)'; the two links are labelled e_1 and e_2.]

Figure 1. The four end nodes of a pair of links can be reconnected in three 
wiring patterns: (a) assortative wiring, where one link connects the two nodes with 
larger degrees and the other link connects the two nodes with smaller degrees; (b) 
disassortative wiring, where one link connects the node with the highest degree with 
the node with the lowest degree and the other link connects the two remaining nodes; 
and (c) neutral wiring. 



2. Link Rewiring Algorithms 

We create surrogate networks by using link rewiring algorithms [20], [21]. 
2.1. Maximal Cases I: Preserving P(k) 

The broad degree distribution P(k) is an important characteristic for complex networks 
and it should be preserved by any link rewiring process [22]. Figure 1 shows that any 
four nodes with degrees k_1 > k_2 > k_3 > k_4 can be connected by two links in three 
possible wiring patterns. One can see that reconnecting a pair of links from one wiring 
pattern to another preserves the degree of individual nodes and therefore preserves the 
degree distribution P(k). It is possible to obtain different kinds of surrogate networks 
by rewiring links in the following ways. 

• Maximal random case I: randomly choose a pair of links and swap two of their 
end nodes. This is equivalent to reconnecting the four end nodes using a wiring pattern 
chosen at random. The process is repeated a sufficiently large number of times. 

• Maximal assortative case I: reconnect a pair of links in the assortative wiring 
pattern (see figure 1(a)) and repeat the process until all link pairs are assortatively 
wired. 

• Maximal disassortative case I: similarly, reconnect all pairs of links using the 
disassortative wiring pattern (see figure 1(b)). 

For all link rewiring algorithms in this paper, a pair of links can be rewired only if 
the resulting graph remains a single connected component. 
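
The three wiring patterns and a single rewiring attempt can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation used for the results: the graph is held as a dictionary of neighbour sets with integer node labels, ties in degree are broken by sort order, and the names `wiring_patterns`, `is_connected` and `rewire_step` are hypothetical. An attempt is rejected if it would create a duplicate link or split the graph.

```python
import random
from collections import deque
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]
Graph = Dict[int, Set[int]]


def wiring_patterns(e1: Edge, e2: Edge, degree: Dict[int, int]) -> Dict[str, List[Edge]]:
    """Return the three ways of joining the four end nodes of two links."""
    n1, n2, n3, n4 = sorted([*e1, *e2], key=lambda n: degree[n], reverse=True)
    return {
        "assortative": [(n1, n2), (n3, n4)],
        "disassortative": [(n1, n4), (n2, n3)],
        "neutral": [(n1, n3), (n2, n4)],
    }


def is_connected(adj: Graph) -> bool:
    """Breadth-first search from an arbitrary node reaches every node."""
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u] - seen:
            seen.add(v)
            queue.append(v)
    return len(seen) == len(adj)


def _set_links(adj: Graph, links: List[Edge], present: bool) -> None:
    """Add (present=True) or remove (present=False) the given links."""
    for u, v in links:
        if present:
            adj[u].add(v)
            adj[v].add(u)
        else:
            adj[u].discard(v)
            adj[v].discard(u)


def rewire_step(adj: Graph, pattern: str, rng: random.Random) -> bool:
    """Try to reconnect one random pair of links in `pattern`; return True on success."""
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    e1, e2 = rng.sample(edges, 2)
    if len({*e1, *e2}) < 4:
        return False  # the two links share a node
    degree = {n: len(nb) for n, nb in adj.items()}
    new = wiring_patterns(e1, e2, degree)[pattern]
    if {frozenset(e) for e in new} == {frozenset(e1), frozenset(e2)}:
        return False  # already wired in the requested pattern
    if any(v in adj[u] for u, v in new):
        return False  # would create a duplicate link
    _set_links(adj, [e1, e2], present=False)
    _set_links(adj, new, present=True)
    if is_connected(adj):
        return True
    # Undo the swap: the graph must remain a single connected component.
    _set_links(adj, new, present=False)
    _set_links(adj, [e1, e2], present=True)
    return False
```

Calling `rewire_step` repeatedly with a pattern drawn at random corresponds to the maximal random case I, while always passing "assortative" or "disassortative" corresponds to the other two cases. Because each of the four nodes loses one link and gains one link, every node degree is unchanged.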

2.2. Maximal Cases II: Preserving Both P(k) And φ(k) 

It is possible to modify the link rewiring process such that the rich-club connectivity 
is preserved as well. For a given degree k the rich-club connectivity φ(k) depends on 
the number of links shared by the nodes belonging to the set R_{>k}. Any rewiring 
between nodes belonging to R_{>k}, or between nodes outside R_{>k}, will not change E_{>k}, 
hence φ(k) will remain the same. As shown in figure 1, E_{>k_1}, E_{>k_2}, E_{>k_3} and E_{>k_4} in 
the disassortative wiring (figure 1(b)) and the neutral wiring (figure 1(c)) are the same, 
because the link e_1 only and always belongs to E_{>k_4}, and the other link e_2 only and 
always belongs to E_{>k_3} and E_{>k_4}. This means that when reconnecting a pair of links 
between the disassortative wiring and the neutral wiring, φ(k) remains unchanged for 
all degrees. This allows us to obtain a different set of maximal cases for a network while 
preserving both the network's P(k) and φ(k). 

• Maximal random case II: if a chosen pair of links are assortatively wired, they 
are discarded and a new pair of links is selected; otherwise the four end nodes are 
reconnected using either the disassortative wiring or the neutral wiring at random. 

• Maximal assortative case II: if a pair of links are not assortatively wired, the 
four nodes are reconnected using the neutral wiring, which will produce a more 
assortative mixing than using the disassortative wiring. The process is repeated for 
all pairs of links. 

• Maximal disassortative case II: if a pair of links are not assortatively wired, the 
four nodes are reconnected using the disassortative wiring. The process is repeated 
for all pairs of links. 
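
The selection rule shared by the three cases II can be sketched as below. This is an illustrative Python fragment under stated assumptions, not the authors' code: the names `rich_links` and `case_ii_options` are hypothetical, and a pair of links is treated as eligible when the smaller end degrees of its two links are the two smallest of the four degrees, which is one way of expressing 'not assortatively wired' that also handles ties in degree.

```python
from typing import Dict, List, Optional, Tuple

Edge = Tuple[int, int]


def rich_links(edges: List[Edge], degree: Dict[int, int], k: int) -> int:
    """E_{>k}: number of links whose two end nodes both have degree larger than k."""
    return sum(1 for u, v in edges if degree[u] > k and degree[v] > k)


def case_ii_options(e1: Edge, e2: Edge, degree: Dict[int, int]) -> Optional[List[List[Edge]]]:
    """Wirings allowed when phi(k) must be preserved, or None if the pair is discarded."""
    n1, n2, n3, n4 = sorted([*e1, *e2], key=lambda n: degree[n], reverse=True)
    # A link counts towards E_{>k} for every k below the smaller of its two end degrees.
    smaller_ends = sorted(min(degree[u], degree[v]) for u, v in (e1, e2))
    if smaller_ends != [degree[n4], degree[n3]]:
        return None  # assortatively wired: rewiring would change E_{>k} for some k
    disassortative = [(n1, n4), (n2, n3)]
    neutral = [(n1, n3), (n2, n4)]
    return [disassortative, neutral]


# Hypothetical degrees for four nodes labelled 1..4.
degree = {1: 5, 2: 4, 3: 3, 4: 2}
options = case_ii_options((1, 4), (2, 3), degree)
for k in range(6):
    print(k, rich_links(options[0], degree, k), rich_links(options[1], degree, k))
```

For every k the two allowed wirings contribute the same number of links to E_{>k}, so choosing between them at random (maximal random case II), always taking the neutral wiring (maximal assortative case II) or always taking the disassortative wiring (maximal disassortative case II) leaves φ(k) unchanged.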

3. Results 

Table 1 describes the data sets and some of their topological properties. Figure 2(a) 
shows that the three networks have a power-law decay in P(k). The degree distribution 
of the Internet is well approximated by P(k) ~ k^{-γ}, γ ≈ 2.24 [?], and it exhibits 
a fat tail where the maximum degree, k_max = 2070, is larger than the power-law 
natural cut-off degree k_cut = 1573. The degree distributions of the protein interaction 
and the scientific collaboration deviate from a strict power-law and have short tails. 
Figure 2(b) shows that the scientific collaboration exhibits the assortative mixing 
behaviour, which is common in social networks. The Internet and protein interaction 
exhibit the disassortative mixing behaviour, which is typical for technological and 
biological networks. The mixing behaviours are also confirmed by evaluating the 
assortative coefficient of the networks (see α in table 1). Figure 2(c) shows that the three 
data sets exhibit different rich-club structures. Rich nodes in the disassortative Internet 
are significantly more tightly interconnected with each other than in the assortative 
scientific collaboration. Only the Internet contains a rich-club clique where the top 16 
richest nodes are fully connected with each other (see n_clique in table 1). One can see 
that an assortative network does not always exhibit a strong rich-club structure, nor 
does a disassortative network always lack a rich-club structure. Indeed high-degree 
nodes have very large numbers of links and only a few of them are enough to provide 
the connectivity to other high-degree nodes, whose number is anyway small [5]. 

Table 1. The three real networks considered are: (a) the Internet network at the 
autonomous system (AS) level [5], [27] from data collected 
by CAIDA [28], in which nodes represent Internet service providers and links represent 
connections among them; (b) the protein interaction network [6], [29] of the 
yeast Saccharomyces cerevisiae (http://dip.doe-mbi.ucla.edu/); and (c) the scientific 
collaboration network [30], in which nodes represent scientists and a connection 
exists if they coauthored at least one paper in the archive. The three networks contain 
multiple components. In this paper we study the giant component of the networks. 
We show the following properties: the number of nodes N and links L in the giant 
component, the average degree ⟨k⟩ = 2L/N, the power-law exponent γ obtained by fitting P(k) 
with k^{-γ} for degrees between 6 (the average degree) and 40, the maximum degree k_max, 
the power-law natural cut-off degree k_cut = N^{1/(γ-1)} [9], the assortative coefficient α, 
the rich-club connectivity φ(k > 40), the rich-club exponent θ obtained by fitting φ(k) 
with k^θ for degrees between 6 and 40, the size of the rich-club clique n_clique, the average 
shortest path length ℓ, and the average shortest path length expected in a random 
graph ℓ* = ln N / ln ⟨k⟩ [5]. 

                                    Internet   Protein       Scientific
                                               interaction   collaboration
Number of nodes N                   9,200      4,626         12,722
Number of links L                   28,957     14,801        39,967
Average degree ⟨k⟩                  6.3        6.4           6.3
Power-law exponent γ                2.24       2.14          2.90
Maximum degree k_max                2,070      282           97
Power-law cut-off degree k_cut      1,573      1,641         145
Assortative coefficient α           -0.236     -0.137        0.161
Rich-club connectivity φ(k > 40)    26.8%      6.4%          7.1%
Rich-club exponent θ                1.52       0.97          1.94
Rich-club clique n_clique           16         -             -
Average shortest path length ℓ      3.1        4.2           6.8
Expected in a random graph ℓ*       5.0        4.5           5.1
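
The derived entries of table 1 can be cross-checked from N, L and γ alone. The short Python sketch below is an illustration only; it uses the expressions given in the table caption for the average degree, the natural cut-off degree and the path length expected in a random graph, and the function names are hypothetical. Because γ is quoted to two decimal places, the recomputed cut-off degrees agree with the table only to within rounding.

```python
import math

# (N, L, gamma) as listed in table 1.
NETWORKS = {
    "Internet": (9200, 28957, 2.24),
    "Protein interaction": (4626, 14801, 2.14),
    "Scientific collaboration": (12722, 39967, 2.90),
}


def average_degree(n_nodes: int, n_links: int) -> float:
    """Average degree 2L/N."""
    return 2.0 * n_links / n_nodes


def natural_cutoff(n_nodes: int, gamma: float) -> float:
    """Power-law natural cut-off degree N^(1/(gamma-1))."""
    return n_nodes ** (1.0 / (gamma - 1.0))


def random_graph_path_length(n_nodes: int, avg_degree: float) -> float:
    """Average shortest path length expected in a random graph, ln N / ln <k>."""
    return math.log(n_nodes) / math.log(avg_degree)


for name, (n_nodes, n_links, gamma) in NETWORKS.items():
    k_avg = average_degree(n_nodes, n_links)
    print(
        name,
        round(k_avg, 1),
        round(natural_cutoff(n_nodes, gamma)),
        round(random_graph_path_length(n_nodes, k_avg), 1),
    )
```

The comparison with k_max makes the contrast in the tails explicit: only for the Internet does the maximum degree (2,070) exceed the natural cut-off degree, whereas for the protein interaction (282) and the scientific collaboration (97) it falls below it.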

Figure 2. Topological properties of the Internet, the protein interaction and the 
scientific collaboration networks: (a) the degree distribution, P(k); (b) the 
nearest-neighbours average degree of k-degree nodes, k_nn(k); and (c) the rich-club connectivity 
as a function of degree, φ(k). 

A relevant metric of a network is the average shortest path length ℓ between 
all nodes. As shown in table 1 the average shortest path length in the Internet is 
significantly smaller than the average shortest path length expected in a random graph 
with the same network size. The Internet is so small [32] because it exhibits both a strong 
rich-club structure and a strong disassortative mixing behaviour. While members of the 
rich-club are tightly interconnected with each other and they collectively function as a 
'super' traffic hub for the network, the disassortative mixing ensures that the majority 
of the network nodes, peripheral low-degree nodes, are always near the rich-club core. 
Thus a typical shortest path between two peripheral nodes consists of three hops, the 
first hop is from the source node to a member of the rich-club, the second hop is 
between two club members and the final hop is to the destination node. One can see 
that a combination of the degree-degree correlation and the rich-club connectivity can 
also explain the distribution of the hierarchical path [33] and the short cycles [20] in a 
network. 

Figure 3 shows the range of the assortative coefficient α obtained by the link 
rewiring algorithms preserving the degree distribution (case I) against that preserving 
both the degree distribution and the rich-club connectivity (case II). The maximal 
random case of a real network is averaged over 40 surrogate networks, each of which is 
obtained by repeating the appropriate link rewiring process 1000 × L times, where 
L is the total number of links in the network. 

For case I, preserving P(k), the maximal random rewiring of the protein interaction 
and the scientific collaboration almost decorrelates the networks, and the assortative and 
disassortative rewiring can produce surrogate networks in the range from assortative to 
disassortative. This is in contrast to the Internet, where the maximal random case is 
almost as disassortative as the original data. In fact all the surrogate networks produced 
by rewiring the Internet are disassortative: the assortative coefficient is always negative 
and its value is restricted to a very small range. This behaviour of the Internet is 
due to the restriction of having a finite network that has a power-law decay in its 
degree distribution and a maximum degree that is larger than the natural cut-off 
degree. 

[Figure 3 legend: original data; maximal random case; maximal assortative case; maximal disassortative case; preserving P(k); preserving P(k) and φ(k). Horizontal axis: assortative coefficient α, from -1 to 1.]

Figure 3. Range of the assortative coefficient α of the three networks under study 
obtained by the link rewiring algorithms preserving P(k) (case I) compared with that 
preserving both P(k) and φ(k) (case II). The inset shows the enlargement for the 
Internet. The standard deviation of α for a maximal rewired case is smaller than the 
symbol representing it. 

For case II, preserving both P(k) and φ(k), the range of α is narrower than in case I, 
where only P(k) is fixed. This result confirms the analytical analysis by Krioukov and 
Krapivsky [31] that the rich-club connectivity constrains a network's degree-degree 
correlation. In the case of the Internet, the assortative coefficient is restricted to an 
even smaller range. This observation suggests that a reasonable model of a real network 
can be produced by modelling the degree distribution and the rich-club connectivity, 
e.g. the Positive-Feedback Preference (PFP) model [27], [35], [36] for the Internet. 

Figure 4 shows the rich-club connectivity of the three networks, each compared with 
its three maximal cases I obtained by preserving P(k). The rich-club connectivity 
changes dramatically due to the rewiring. For all the maximal assortative networks there 
is a notable increase of φ(k) throughout all degrees and all contain a fully connected 
rich-club clique which consists of nodes with degrees larger than 78, 48 and 46 for the 
Internet, the protein interaction and the scientific collaboration respectively. For all the 
maximal disassortative networks there is a complete collapse of the rich-club structure 
such that there is no single link shared between nodes with degrees larger than 23. This 
suggests that networks with the same degree distribution can have very different 
rich-club connectivity. In other words the degree distribution does not constrain the 
rich-club connectivity. The rich-club connectivity is sensitive to the change of a network's 
degree-degree correlation. For the Internet, a minor change in the assortative coefficient 
within the narrow range of α ∈ (-0.218, -0.275) could reverse the rich-club structure 
completely. This highlights the importance of measuring the rich-club connectivity when 
evaluating a network model. 

Figure 4. Rich-club connectivity φ(k) of (a) the Internet, (b) the protein interaction, 
and (c) the scientific collaboration, compared with the three maximal cases I obtained 
by preserving P(k). 

4. Discussion 

The maximal random network obtained by preserving P(k) has been used to discern 
whether the existence of an interaction between two proteins is due to chance or not [6]. 
To do so, the probability that two nodes share a link in the protein interaction network 
is compared against the probability that the same two nodes will share a link in the 
maximal random network. The maximal random network is used as a null model because 
in this case it is almost a decorrelated network (see figure 3). 

Recently the maximal random network has also been used as a null model to 
detect the origin of the rich-club connectivity in real networks [19]. The argument 
is that if the rich-club connectivity of the original network is the same as that of 
the maximal random network then the rich-club connectivity was created by chance, 
otherwise there was an 'organisational principle' responsible for the existence (or the 
lack) of the rich-club structure. In the case of the Internet, the original network† and the 
maximal random network have similar rich-club connectivity (see figure 4(a)), so the 
conclusion in Refs. [19], [38] was that 'hubs in the Internet . . . are not tightly interconnected' 
and 'the Internet does not have an oligarchic structure whereas, for example, scientific 
collaborations do'. However, as shown in figure 2(c), the Internet does contain a well 
connected rich-club core and we do not need more statistical analysis to support this 
observation. 

† Ref. [19] used the Internet data collected by the Route Views project [37], which exhibits a similar rich-club 
structure to the Internet data collected by CAIDA used in this paper. 

To understand the problem of using the maximal random network of the Internet 
as a null model, one needs to realise that the maximal random network in this case is 
not an uncorrelated network. On the contrary it is a strongly correlated network and is 
almost as disassortative as the Internet. Rich nodes in both the original network and 
the maximal random network are tightly interconnected, and the similarity between the 
rich-club connectivity of the two networks does not imply that the Internet lacks a 
rich-club structure. 

Notice that the maximal random network for the Internet with P(k) fixed is 
more disassortative than the original network, and the maximal random network with 
P(k) and φ(k) both fixed is even more disassortative (see inset in figure 3). This 
suggests that the rich-club structure depends strongly on the nature of the 
degree-degree correlation and it was not formed by chance. This strong dependence on the 
tail of the degree distribution (k_max) and the degree-degree correlation has also been 
noted in the estimates of large cliques that appear in random scale-free networks [39]. 
A more detailed analysis of the null model of the rich-club connectivity will be published 
elsewhere. 

5. Conclusions 

The rich-club connectivity and the degree-degree correlation describe the global 
structure of a network from different perspectives. We show that for a given degree 
distribution, the rich-club connectivity is sensitive to the degree-degree correlation, 
and on the other hand the degree-degree correlation is constrained by the rich-club 
connectivity. In particular for the case of the Internet, the assortative coefficient 
is always negative and a minor change in its value can reverse the network's rich- 
club structure completely; if fixing both the degree distribution and the rich-club 
connectivity, the assortative coefficient is restricted to such a narrow range that a 
reasonable model of the Internet can be produced by considering mainly the degree 
distribution and the rich-club connectivity. 

We also clarify some misinterpretations that have appeared in the literature which 
use the maximal random case as a null model to assess the rich-club connectivity in real 
networks. We remark that some care is needed to avoid reaching misleading conclusions, 
in particular when studying the Internet. 